Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems
Natural language generation (NLG) is a critical component of spoken dialogue
and it has a significant impact both on usability and perceived quality. Most
NLG systems in common use employ rules and heuristics and tend to generate
rigid and stylised responses without the natural variation of human language.
They are also not easily scaled to systems covering multiple domains and
languages. This paper presents a statistical language generator based on a
semantically controlled Long Short-term Memory (LSTM) structure. The LSTM
generator can learn from unaligned data by jointly optimising sentence planning
and surface realisation using a simple cross entropy training criterion, and
language variation can be easily achieved by sampling from output candidates.
With fewer heuristics, an objective evaluation in two differing test domains
showed the proposed method improved performance compared to previous methods.
Human judges scored the LSTM system higher on informativeness and naturalness
and overall preferred it to the other systems.
Comment: To appear in EMNLP 2015
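The variation-by-sampling idea in the abstract can be illustrated with a toy sketch. Everything here is hypothetical (the `STEP_DISTRIBUTIONS` table, `sample_sentence`, and `sample_candidates` are illustrative stand-ins, not the paper's generator): at each decoding step a trained generator yields a distribution over tokens, and sampling from it rather than always taking the most likely token produces varied surface realisations of the same dialogue act.

```python
import random

# Hypothetical per-step token distributions from a trained generator,
# conditioned on a dialogue act such as inform(name=X, food=Y).
# "X" and "Y" stand for delexicalised slot placeholders.
STEP_DISTRIBUTIONS = [
    {"X": 1.0},
    {"serves": 0.6, "offers": 0.4},
    {"Y": 1.0},
    {"food": 0.7, "cuisine": 0.3},
]

def sample_sentence(steps, rng):
    """Draw one surface realisation by sampling a token at each step."""
    tokens = []
    for dist in steps:
        choices, weights = zip(*dist.items())
        tokens.append(rng.choices(choices, weights=weights)[0])
    return " ".join(tokens)

def sample_candidates(steps, n=5, seed=0):
    """Sample n candidate realisations; repeats indicate likely phrasings."""
    rng = random.Random(seed)
    return [sample_sentence(steps, rng) for _ in range(n)]

for sentence in sample_candidates(STEP_DISTRIBUTIONS):
    print(sentence)
```

In the real system the distributions come from the semantically conditioned LSTM and candidates can additionally be reranked; this sketch only shows why sampling yields natural variation where argmax decoding would always emit the single most probable sentence.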
Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems
Statistical spoken dialogue systems have the attractive property of being
able to be optimised from data via interactions with real users. However in the
reinforcement learning paradigm the dialogue manager (agent) often requires
significant time to explore the state-action space to learn to behave in a
desirable manner. This is a critical issue when the system is trained on-line
with real users where learning costs are expensive. Reward shaping is one
promising technique for addressing these concerns. Here we examine three
recurrent neural network (RNN) approaches for providing reward shaping
information in addition to the primary (task-orientated) environmental
feedback. These RNNs are trained on returns from dialogues generated by a
simulated user and attempt to diffuse the overall evaluation of the dialogue
back down to the turn level to guide the agent towards good behaviour faster.
In both simulated and real user scenarios these RNNs are shown to increase
policy learning speed. Importantly, they do not require prior knowledge of the
user's goal.
Comment: Accepted for publication in SigDial 2015
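The shaping mechanism described above can be sketched in its standard potential-based form. This is a hedged illustration, not the paper's implementation: `predicted_return` is a hand-written stand-in for the trained RNN's estimate of the dialogue return, and the slot-counting heuristic inside it is entirely hypothetical. Potential-based shaping adds F = γΦ(s') − Φ(s) to the environmental reward at each turn, which guides exploration without changing the optimal policy.

```python
GAMMA = 0.99  # discount factor (assumed value)

def predicted_return(state):
    """Stand-in for the RNN's predicted dialogue return.
    Hypothetical heuristic: score how many slots are filled so far."""
    return float(len(state["filled_slots"]))

def shaped_reward(env_reward, state, next_state, done):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).
    The potential of a terminal state is taken as zero."""
    phi = predicted_return(state)
    phi_next = 0.0 if done else predicted_return(next_state)
    return env_reward + GAMMA * phi_next - phi

# A turn that fills a new slot receives a positive shaping bonus
# even though the primary (task-orientated) reward is still zero.
s = {"filled_slots": ["food"]}
s_next = {"filled_slots": ["food", "area"]}
print(shaped_reward(0.0, s, s_next, done=False))
```

The point of the sketch is the turn-level signal: the overall dialogue evaluation, diffused down through Φ, rewards intermediate progress long before the task-completion reward arrives at the end of the dialogue.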
Research data supporting "Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems"
This dataset is in JSON format and contains log files of interactions between a turn-taking spoken dialogue system and Amazon Mechanical Turk workers, collected from our previous live trials. It covers two application domains, San Francisco restaurants and hotels, each with around 1000 logs. The user responses are the 1-best ASR hypotheses recognised by our ASR system, and the system responses were collected in a further round of AMT data collection, yielding around 5.1K system responses per domain. All users are anonymous.

This record supports the publication available at http://mi.eng.cam.ac.uk/~thw28/papers/EMNLP15.pdf. This work was supported by Toshiba Research Europe, Cambridge Research Laboratory [grant number RG74649].
Research Data Supporting "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems"
This is a natural language generation dataset collected via Amazon Mechanical Turk for the paper "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems" in NAACL-HLT 2016. It covers two consumer-electronics domains: laptop and TV. Each file is in JSON format and contains a list of three-element tuples: the dialogue act (semantic representation), the sentence produced by AMT workers, and the sentence produced by our handcrafted template generator. There are 13K and 7K distinct DA-and-sentence pairs in the laptop and TV domains, respectively. All products are anonymous.

Toshiba Research Europe Ltd, Cambridge Research Laboratory
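The record describes each file as a JSON list of three-element tuples. A minimal loading sketch follows; the sample dialogue act and sentences are invented for illustration, and `load_pairs` is a hypothetical helper, not part of the released data.

```python
import json

# Hypothetical example matching the described schema: a JSON list of
# [dialogue_act, AMT_sentence, template_sentence] tuples.
sample = json.dumps([
    ["inform(type=laptop, battery=long)",
     "this laptop has a long battery life",
     "the LAPTOP has a LONG battery"],
])

def load_pairs(raw_json):
    """Parse the tuple list into labelled dicts for downstream use."""
    return [
        {"da": da, "amt": amt, "template": tpl}
        for da, amt, tpl in json.loads(raw_json)
    ]

records = load_pairs(sample)
print(records[0]["da"])
```

Keeping the template realisation alongside the AMT one makes it easy to compare a learned generator against the handcrafted baseline on identical dialogue acts.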
Research data supporting "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems"
This repository contains the data presented in the paper "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems" in ACL 2016. Two separate datasets, described in section 4 of the paper, are included:

1. DialogueEmbedding/ contains the [train|valid|test] data for unsupervised dialogue embedding creation, each with *.[feature|reward|turn|subjsuc] files. Note that *.turn records which lines of *.[feature|reward|subjsuc] belong to each dialogue, and *.subjsuc holds the user's subjective rating. The feature size is 74.

2. DialoguePolicy/ includes four contrasting systems with different reward models: [GP|RNN|ObjSubj|Subj]. Each system directory holds the data obtained in interaction with Amazon Mechanical Turk users while training three policies with the same config, policy_[1|2|3], plus a .csv file with the evaluation results along the training process. Each policy_[1|2|3]/ directory contains a list of calls, time-stamped in the name, each with a session.xml file for the dialogue log and a feedback.xml file for the user feedback.

This research data supports "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems", published in "Proceedings of the Association for Computational Linguistics (ACL)". This work was supported by the EPSRC [grant number Cambridge Trust].
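The *.turn bookkeeping described above can be read with a short sketch. The exact file semantics are an assumption based on the record's description (each *.turn entry is taken to give the number of lines of the parallel *.feature/*.reward/*.subjsuc files belonging to one dialogue), and `split_dialogues` is a hypothetical helper.

```python
def split_dialogues(lines, turns_per_dialogue):
    """Slice a flat list of per-turn lines into one list per dialogue,
    using the per-dialogue turn counts from a *.turn file."""
    dialogues, start = [], 0
    for n in turns_per_dialogue:
        dialogues.append(lines[start:start + n])
        start += n
    assert start == len(lines), "turn counts must cover every line"
    return dialogues

# Toy example: five feature rows belonging to dialogues of 2 and 3 turns.
features = ["f1", "f2", "f3", "f4", "f5"]
print(split_dialogues(features, [2, 3]))
```

The same slicing applies to the *.reward and *.subjsuc files, since the record describes them as line-parallel with *.feature.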